Pitch Marks at Peaks or Valleys?
Abstract
This paper deals with the problem of speech waveform polarity. Because the polarity of a speech waveform can influence the performance of pitch marking algorithms (see Sec. 4), the paper presents a simple method for determining speech signal polarity. We call this problem peak/valley decision making, i.e. deciding whether pitch marks should be placed at peaks (local maxima) or at valleys (local minima) of a speech waveform. The proposed method can also be used to check the polarity consistency of a speech corpus, which is important for the concatenation of speech units in speech synthesis.
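The peak/valley decision can be illustrated with a minimal sketch. The abstract does not specify the paper's method, so the heuristic below is an assumption made purely for illustration: in each fixed-length frame it compares the largest positive excursion from the frame mean with the largest negative one, then takes a majority vote. The names `peak_valley_decision`, `synthetic_voiced`, and the `frame_len` parameter are hypothetical.

```python
import math


def peak_valley_decision(samples, frame_len=400):
    """Return "peaks" or "valleys" for a sequence of waveform samples.

    Illustrative heuristic only (not the method of the paper): each frame
    votes for the side on which its extreme excursion from the frame mean
    is larger. Ties, including empty input, default to "peaks".
    """
    votes = 0
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        mean = sum(frame) / frame_len
        positive_excursion = max(frame) - mean
        negative_excursion = mean - min(frame)
        if positive_excursion > negative_excursion:
            votes += 1
        elif negative_excursion > positive_excursion:
            votes -= 1
    return "peaks" if votes >= 0 else "valleys"


def synthetic_voiced(n=4000, period=80, polarity=1.0):
    """Hypothetical test signal: one sharp, decaying pulse per pitch period."""
    return [polarity * math.exp(-(i % period) / 5.0) for i in range(n)]


if __name__ == "__main__":
    print(peak_valley_decision(synthetic_voiced()))
    print(peak_valley_decision(synthetic_voiced(polarity=-1.0)))
```

Under the same assumption, a corpus could be checked for polarity consistency by running the decision on every utterance and flagging those that disagree with the majority.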
Similar resources
A two-phase pitch marking method for TD-PSOLA synthesis
This paper describes a robust two-phase pitch marking method based on peak-valley decision and dynamic programming. In the first phase, we select either peaks or valleys as pitch mark candidates according to their similarity to an estimated pitch curve. In the second phase, we define state and transition probabilities, and then employ dynamic programming to find the most likely pitch marks. We h...
Pitch Marking Based on an Adaptable Filter and a Peak-Valley Estimation Method
In a text-to-speech (TTS) conversion system based on the time-domain pitch-synchronous overlap-add (TD-PSOLA) method, accurate estimation of pitch periods and pitch marks is necessary for pitch modification to assure an optimal quality of the synthetic speech. In general, there are two major issues on pitch marking: pitch detection and location determination. In this paper, an adaptable filter,...
Measuring pitch range
The literature offers at least two methods to annotators for characterizing the pitch range of a prosodic phrase. One method, included in the ToBI framework, is in terms of the distance between the F0 maximum of the phrase (HiF0) and the speaker’s utterance-final pitch (LoF0). The other method, proposed by Ladd and by ‘t Hart and colleagues, is in terms of the distance between pitch peaks and p...
Mind the Peak: When Museum is Temporarily Understood as Musical in Australian English
Intonation languages signal pragmatic functions (e.g. information structure) by means of different pitch accent types. Acoustically, pitch accent types differ in the alignment of pitch peaks (and valleys) in regard to stressed syllables, which makes the position of pitch peaks an unreliable cue to lexical stress (even though pitch peaks and lexical stress often coincide in intonation languages)...
Effects of pitch range variation on f0 extrema in an imitation task
A central issue in speech intonation research concerns how fundamental frequency (f0) variation relates to phonological categories. The hypothesis was tested that pitch range variation which affects whether one syllable is higher or lower than another would elicit categorical shifts in f0 extremum timing in an imitation task. Participants heard synthetic versions of the phrase Some lemonade wit...